The kind of object-oriented programming most people are familiar with is usually in the style of Java. It's unfortunate because the original concept of object-oriented programming didn't work that way — it's really just a subset of functional programming.
It's important to know a bit about the history of Erlang in order to understand why some of the design decisions were made. Erlang was designed at Ericsson for their work on telephony switches.
The best example is the often reported nine nines (99.9999999%) of availability offered on the Ericsson AXD 301 ATM switches, consisting of over 1 million lines of Erlang code.
- Learn You Some Erlang
All of it's features were designed with practicality in mind, since they were dealing with real world problems: transparent distribution (horizontal and vertical scaling), code hot-swapping, asynchronicity, soft realtime, granular supervision, fault tolerance, load balancing, live debugging, etc.
Erlang is viewed as a functional language because all data structures are immutable (immutability greatly simplifies concurrent programming), pattern matches on messages, and has first class functions. But it's actually multi-paradigm. Some argue that the real benefit of object oriented programming was the message-passing/actor paradigm that came from Smalltalk and Erlang.
Actors are the Erlang style of object oriented programming. They are treated as black boxes — no actor can mess with the internals of another actor. Instead actors communicate with other actors asynchronously via message passing.
The Erlang engineers came to a lot of great insights about distributed programming, but two really stick out: transparent distribution and fault tolerance.
What they realized was that processes are distributed along multiple machines in large-scale real world applications. Even if they are co-located on the same machine, often times they are basically a black box because it's closed source, or essentially closed source because it's unknown code, or it might migrate to another machine eventually.
This is where the idea of transparent distribution comes from. They realized that they could greatly simplify scalability this way, as it doesn't matter if an actor is on your machine or on a remote machine, or if it's a small thing, or if it grew into a large thing.
This is where the concept of encapsulation in object oriented programming came from. The contents are kept hidden because you can't see those things, they're on another machine! and even if you could see it today, you might not be able to see it tomorrow as things scale.
This is also why actors communicate asynchronously via message passing as well — you might be tempted to do things syncronously when everything is on the same machine today because latency is virtually non-existent, but tomorrow that might not be the case. It's bad news if you need to redesign everything as things start to scale.
Another major insight the Erlang engineers had was that in computing, especially distributed computing, things fail. Types won't help you at all if the hardware fails, or someone trips on a wire, or a solar storm happens, or a hurricane — you get the idea.
Erlang's approach to fault tolerance is inspired by Common
Lisp's conditions and restarts system (I'll cover
Lisp in another blog post). Erlang errors
are propagated across actors as signals, and these signals can be actual
UNIX signals like SIGINT
and SIGTERM
. Some of
these signals won't kill an actor, they can be safely ignored, while
some of them such as exit signals will. This is just like Common
Lisp conditions.
In order to prevent an actor from getting killed by an exit signal, the exits (errors) need to be trapped (caught). Trapping converts the signal into an actual message that the actor can handle. These are like restarts in Common Lisp.
Learn You Some Erlang - It's a trap:
1> process_flag(trap_exit, true).
true
2> spawn_link(fun() -> linkmon:chain(3) end).
<0.49.0>
3> receive X -> X end.
{'EXIT',<0.49.0>,"chain dies here"}
It's very important to understand that this is not like exception handling in more mainstream languages like Java. When an exception is caught in Java, the block continues off at a lower point in the call stack, so exception handling is mostly there just to gracefully die or to completely restart a part of the program from scratch.
Restarts in Common Lisp and Erlang are different. They allow you to restart at the highest point possible of the program and resume with all of the state still intact! In Common Lisp, it resumes at the stack frame and place where it errored with all the state still there. In Erlang, restarts are resumed at the highest point of the program, as a message to be read by the mailbox, and also with all the state because it's treated like a regular message by the actor.
There is another feature of Erlang, fast restarts — when things go down, they should be quick to spin back up for maximum reliability. Erlang's virtual machine, BEAM, has a construct specially designed for that: erlang processes.
These aren't like OS processes or threads, but more like something in between. You can't share state between them except via message passing, and they have unique identifiers that can span across machines — sort of like a super OS process, but they're like threads in that they're lightweight and really fast to spin up.
The Erlang engineers realized that a lot of common patterns with similar solutions keep coming up, so they created a set of libraries collectively called OTP to work well with a set of given common patterns called behaviors (which are similar to interfaces).
They also realized that while trapping signals is a great technique, sometimes processes fail for unforseen reasons, or they're better off failing entirely and restarted from scratch. Supervisors come from OTP, and they're meant to address this. A supervisor actor can supervise one or more child actors and perform some action(s) if they fail. Stacking supervisors on top of each other (a supervisor supervising supervisors, etc) creates a supervision tree. This is how Erlang creates highly fault tolerant systems.
Learn You Some Erlang - Supervisors:
-module(band_supervisor).
-behaviour(supervisor).
-export([start_link/1]).
-export([init/1]).
start_link(Type) ->
supervisor:start_link({local,?MODULE}, ?MODULE, Type).
%% The band supervisor will allow its band members to make a few
%% mistakes before shutting down all operations, based on what
%% mood he's in. A lenient supervisor will tolerate more mistakes
%% than an angry supervisor, who'll tolerate more than a
%% complete jerk supervisor
init(lenient) ->
init({one_for_one, 3, 60});
init(angry) ->
init({rest_for_one, 2, 60});
init(jerk) ->
init({one_for_all, 1, 60});
Ocaml has mutable objects similar to Java except that mutation is more principled and explicit. And most Ocaml engineers use modules instead, which are like a more powerful expressive objects, to codify similar patterns.
Mutable values in Ocaml objects have to be defined explicitly using the
mutable
keyword:
let stack init = object
val mutable value = init
method pop =
match value with
| hd :: tl ->
value <- tl;
Some hd
| [] -> None
method push hd =
value <- hd :: value
end
Mutability is also more disciplined in Ocaml — the
mutable
value in the above example can't spill out of
the object, it can only live inside the object instance. This is like
Clojure's
transients but
for objects.
Another similar construct is ref
s. This is also more
principled and disciplined than mutation in Java because
ref
is a boxed type — an int ref
, for example,
can't be passed to a function that expects int
, it has
to be dereferenced first. This way mutation doesn't accidentally
spill out into the rest of your program, it's only where it's
supposed to be:
let hi: string ref = ref "hello"
let f (x: string) = print_endline x
(* This won't compile:
let _ = f hi
It'll error out with:
This has type:
string ref
But somewhere wanted:
string
*)
(* Dereferencing hi with an ! will take out the value *)
let _ = f !hi
Type signatures are optional in Ocaml, they're only annotated in the example for clarity.
Objects don't need to be mutable, even in Java. However, without the right language constructs it'll feel really awkward:
let immutable_stack init = object
val value = init
method pop =
match value with
| hd :: tl -> Some (hd, {< value = tl >})
| [] -> None
method push hd =
{< value = hd :: value >}
end
The {< ... >}
is syntax that produces a copy of the
current object with the specified fields updated. Immutable objects are
interesting but rarely used because Ocaml has such a powerful module
system.
You can implement the same mutable stack example from earlier, but this isn't really what modules are good at:
module type INIT = sig
type 'a t and a
val value : a t
end
module type STACK = sig
type 'a t and a
val pop : unit -> a option
val push : a -> unit
end
module Stack (Init: INIT with type 'a t = 'a list):
STACK with type a = Init.a and type 'a t = 'a Init.t = struct
type 'a t = 'a Init.t and a = Init.a
let value = ref Init.value
let pop () =
match !value with
| hd :: _ -> Some hd
| [] -> None
let push hd =
value := hd :: !value
end
It's possible just more verbose. There are some advantages where modules allow you to express more complex polymorphic mutable containers that regular objects can't do, but these are advanced situations and rare. If you need to implement a stack you're better off using objects.
The module system is very useful for code organization:
module Swatch = struct
module Color = struct
type foreground = [ `white | `lightgrey ]
type background = [ `black | `darkgray ]
end
module Size = struct
let small = `px 12
let medium = `px 16
let large = `px 24
end
module Font = struct
type value = [ `fira_sans | `fira_code ]
type t = [ `font of value ]
let main = `font `fira_sans
let code = `font `fira_code
end
end
(* stubbed function that uses a font *)
let f (_: Swatch.Font.t) = ()
let _ = f Swatch.Font.code
It's also useful for situations where a module naturally subdivides into multiple sections, but the sections aren't important enough to get their own file (yet). This way you can grow your app in a very natural way. You can nest modules as much as you want.
This is another great use case for modules: implementing generic functions that operate on any type as long as they implement a specific interface.
In this example there's a module type
(basically an
interface) called APPENDABLE
. Any type that has an
empty
value and a way to append
values
together can implement that interface. For example, string
s
have an empty
value of ""
, and
append
would concatenate strings together. And
int
egers have an empty
value of
0
and append
would add integers together.
Here I'm defining a Collector
module, that defines a
collect
function. That function will take a list of
anything that's APPENDABLE
and accumulate the
values together:
module type APPENDABLE = sig
type t
val empty : t
val append : t -> t -> t
end
module Collector (T: APPENDABLE) = struct
let collect (x: T.t list) =
x |> List.fold_left T.append T.empty
end
module StringCollector = Collector (struct
type t = string
let empty = ""
let append a b = a ^ b
end)
module IntCollector = Collector (struct
type t = int
let empty = 0
let append a b = a + b
end)
(* prints: "foobarbaz" *)
let _ =
["foo"; "bar"; "baz"]
|> StringCollector.collect
|> print_endline
(* prints: 10 *)
let _ =
[1; 2; 3; 4]
|> IntCollector.collect
|> string_of_int
|> print_endline
Note the syntax on Collector
. This is called a functor,
which is like a module-function — it takes one or more modules (in this
case one) and returns a module.
Ocaml modules are first-class, which means you can pass them around in functions. First class modules pass around type information, so you can have a container that normally would only take a homogenous type, like an array of strings or a list of integers, can have it take in different types:
module type V = sig
type t
val x : t
val to_string : t -> string
end
(* A collection of heterogeneous modules! *)
let abc: (module V) list =
[
(module struct
type t = int
let x = 123
let to_string = string_of_int
end);
(module struct
type t = string
let x = "hi"
let to_string x = x
end);
(module struct
type t = bool
let x = true
let to_string = string_of_bool
end)
]
(* Iterate over them and print the string value *)
let _ =
abc
|> List.iter @@ fun item ->
let module M = (val item : V) in
print_endline @@ M.to_string M.x
You might wonder how that's possible since Ocaml is strongly typed.
Ocaml only supports upcasting, so the type information in the module
becomes the abstract type t
unless you explicitly use
with_type
(the polymorphic type becomes a rigid type to
preserve type safety), which is why a to_string
function is
provided.
You can store the type information in a variant though if you need to. Or if it's a finite set you can provide extra type info and store it in a tuple:
let xyz =
(
(module struct
type t = int
let x = 123
let to_string = string_of_int
end : V with type t = int)
, (module struct
type t = string
let x = "hi"
let to_string x = x
end : V with type t = string)
, (module struct
type t = bool
let x = true
let to_string = string_of_bool
end : V with type t = bool)
)
(* val xyz : (module V with type t = int)
* (module V with type t = string)
* (module V with type t = bool) *)
You might wonder why you would use a functor over a first-class module,
since both can do the same thing of taking modules as arguments and
returning modules. There are specific cases where the Ocaml compiler is
stricter with first class modules. It will throw an error that a value
will escape it's scope. Otherwise, first class modules might be more
expressive. This is the Collector
example from earlier with
first class modules:
module type EMPTY = sig
type t
val empty : t
end
module type APPENDABLE = sig
type t
val empty : t
val append : t -> t -> t
end
module type APPENDABLE_V = sig
include APPENDABLE
val value : t
end
let empty (type a) empty: (module EMPTY with type t = a) =
(module struct
type t = a
let empty = empty
end)
let appendable (type a) empty append (value: a):
(module APPENDABLE_V with type t = a) =
let module Empty = (val empty: EMPTY with type t = a) in
(module struct
include Empty
let append = append
let value = value
end)
let collect (type a) empty x: a =
let module E = (val empty: EMPTY with type t = a) in
x
|> List.fold_left (fun acc item -> begin
let module I = (val item: APPENDABLE_V with type t = a) in
I.append acc I.value
end)
E.empty
module Empty = struct
let string = empty ""
and int = empty 0
and bool = empty true
end
let string = appendable Empty.string (^)
let int = appendable Empty.int (+)
let bool = appendable Empty.bool (||)
let x: string =
["foo"; "bar"; "baz"]
|> List.map string
|> collect Empty.string
(* prints: "foobarbaz" *)
let y: int =
[123; 456; 789]
|> List.map int
|> collect Empty.int
(* prints: 1368 *)
let z: bool =
[true; false; true]
|> List.map bool
|> collect Empty.bool
(* prints: true *)